15 research outputs found

    Composing Graph Theory and Deep Neural Networks to Evaluate SEU Type Soft Error Effects

    Full text link
    Rapidly shrinking technology nodes and voltage scaling increase the susceptibility of digital circuits to soft errors. Soft errors are radiation-induced effects that occur when radiation particles such as alpha particles, neutrons, or heavy ions interact with sensitive regions of microelectronic devices and circuits. The particle hit can be a glancing blow or a penetrating strike. A well-understood and well-characterized way of analyzing soft error effects is the fault-injection campaign, but it is typically acknowledged to be a time- and resource-consuming simulation strategy. As an alternative to traditional fault-injection-based methodologies, and to explore the applicability of modern graph-based neural network algorithms in the field of reliability modeling, this paper proposes a systematic framework that explores gate-level abstractions to extract and exploit relevant feature representations in a low-dimensional vector space. The framework allows extensive prediction analysis of SEU-type soft error effects in a given circuit. GraphSAGE, a scalable and inductive representation learning algorithm on graphs, is used to efficiently extract structural features of the gate-level netlist, providing a valuable database for a downstream machine learning or deep learning algorithm aimed at predicting fault propagation metrics. The Functional Failure Rate (FFR), the predicted fault propagation metric of SEU-type faults, is estimated within the gate-level circuit abstraction of the 10-Gigabit Ethernet MAC (IEEE 802.3) standard circuit. Comment: 5 pages, 3 figures, Conference: 2020 9th Mediterranean Conference on Embedded Computing (MECO)
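    A minimal sketch of the kind of pipeline this abstract describes, assuming the gate-level netlist has already been converted to a graph (nodes = gates, edges = nets) with simple per-gate features; the node features, toy graph, and regression head below are illustrative placeholders, not the authors' actual setup.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv  # GraphSAGE layer from PyTorch Geometric

class NetlistSAGE(torch.nn.Module):
    """Two-layer GraphSAGE encoder followed by a per-node FFR regression head."""
    def __init__(self, in_dim, hid_dim=64):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hid_dim)
        self.conv2 = SAGEConv(hid_dim, hid_dim)
        self.head = torch.nn.Linear(hid_dim, 1)  # predicted FFR per instance

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        return self.head(h).squeeze(-1)

# Toy graph: 4 gates with 3 structural features each and a few directed nets.
x = torch.rand(4, 3)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]], dtype=torch.long)
y = torch.rand(4)  # reference FFR values would come from fault injection

model = NetlistSAGE(in_dim=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = F.mse_loss(model(x, edge_index), y)
    loss.backward()
    opt.step()
```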

    Machine Learning to Tackle the Challenges of Transient and Soft Errors in Complex Circuits

    Full text link
    The Functional Failure Rate analysis of today's complex circuits is a difficult task and requires a significant investment in terms of human effort, processing resources, and tool licenses. De-rating or vulnerability factors are therefore a major instrument of failure analysis efforts. Usually, computationally intensive fault-injection simulation campaigns are required to obtain fine-grained reliability metrics at the functional level. This paper therefore investigates the use of machine learning algorithms to assist this procedure and thus optimise and enhance fault injection efforts. Specifically, machine learning models are used to predict accurate per-instance Functional De-Rating data for the full list of circuit instances, an objective that is difficult to reach using classical methods. The described methodology uses a set of per-instance features, extracted through an analysis approach combining static elements (cell properties, circuit structure, synthesis attributes) and dynamic elements (signal activity). Reference data is obtained through first-principles fault simulation approaches. One part of this reference dataset is used to train the machine learning model and the remainder is used to validate and benchmark the accuracy of the trained tool. The presented methodology is applied to a practical example, and various machine learning models are evaluated and compared
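    A minimal sketch of the train/validate split described above, assuming the per-instance features (cell properties, structural attributes, signal activity) and the reference de-rating values from fault simulation are already tabulated; the column names and the chosen model are illustrative assumptions.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Hypothetical feature table: one row per circuit instance.
df = pd.DataFrame({
    "fanout":              [3, 8, 1, 5, 2, 7],
    "logic_depth":         [4, 9, 2, 6, 3, 8],
    "toggle_rate":         [0.10, 0.45, 0.05, 0.30, 0.12, 0.40],
    "functional_derating": [0.02, 0.35, 0.01, 0.20, 0.05, 0.35],
})
X = df.drop(columns="functional_derating")
y = df["functional_derating"]

# Train on one part of the fault-simulation reference data, validate on the rest.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
print("MAE on held-out instances:", mean_absolute_error(y_test, model.predict(X_test)))
```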

    Machine Learning Clustering Techniques for Selective Mitigation of Critical Design Features

    Full text link
    Selective mitigation, or selective hardening, is an effective technique to obtain a good trade-off between the improvement in the overall reliability of a circuit and the hardware overhead induced by the hardening techniques. Selective mitigation relies on preferentially protecting circuit instances according to their susceptibility and criticality. However, ranking circuit parts in terms of vulnerability usually requires computationally intensive fault-injection simulation campaigns. This paper presents a new methodology which uses machine learning clustering techniques to group flip-flops with similar expected contributions to the overall functional failure rate, based on the analysis of a compact set of features combining attributes from static and dynamic elements. Fault simulation campaigns can then be executed on a per-group basis, significantly reducing the time and cost of the evaluation. The effectiveness of grouping similarly sensitive flip-flops by machine learning clustering algorithms is evaluated on a practical example. Different clustering algorithms are applied and the results are compared to an ideal selective mitigation obtained by exhaustive fault-injection simulation
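    A minimal sketch of grouping flip-flops by feature similarity before running per-group fault simulation, assuming a feature vector per flip-flop; the synthetic features, the number of clusters, and the two algorithms compared are illustrative stand-ins for the paper's setup.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, AgglomerativeClustering

rng = np.random.default_rng(0)
features = rng.random((200, 4))  # e.g. fanout, depth, activity, slack per flip-flop

X = StandardScaler().fit_transform(features)
for algo in (KMeans(n_clusters=8, n_init=10, random_state=0),
             AgglomerativeClustering(n_clusters=8)):
    labels = algo.fit_predict(X)
    # Fault injection would then target a few representatives per cluster
    # instead of every flip-flop, and the per-cluster result is extrapolated.
    sizes = np.bincount(labels)
    print(type(algo).__name__, "cluster sizes:", sizes)
```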

    Radiation-Hardening-By-Design (RHDB) and modeling of single event effects in digital circuits manufactured in Bulk 65 nm and FDSOI 28 nm

    No full text
    The extreme technology scaling of digital circuits increases their sensitivity to ionizing radiation, whether in space or terrestrial environments. Natural radiation can now induce single event effects in deca-nanometer circuits and impact their reliability. This thesis focuses on the modeling of single event mechanisms and the development of hardening-by-design solutions that mitigate the radiation threat to the circuit error rate. In the first part of this work, we have developed a physical model for both the transport and the collection of radiation-induced charges in a biased circuit, derived from pure physics-based equations without any fitting parameter. This model is called Random-Walk Drift-Diffusion (RWDD). This particle-level model and its transient numerical solution allow the charge collection process to be coupled with a circuit simulator, taking into account the time variation of the electric fields in the structure. The RWDD model has been successfully integrated into a simulation platform able to estimate the response of a circuit to an ionizing particle strike, independently of the implemented function and the considered technology. In the second part of our work, hardening solutions that limit radiation impacts on circuit reliability have been developed. At the elementary cell level, new radiation-hardened latch architectures have been proposed, with a limited impact on performance. At the system level, a clock tree duplication methodology relying on specific latches has been proposed. Finally, a triplication flow has been designed for critical applications. All these solutions have been implemented in 65 nm and UTBB-FDSOI 28 nm technologies, and radiation tests have been performed to measure their hardening efficiency
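    A minimal sketch of the random-walk drift-diffusion idea behind an RWDD-style model: each deposited charge carrier takes a Brownian (diffusion) step plus a drift step set by the local electric field, and carriers reaching the collection boundary are counted as collected charge. The field, mobility, diffusion constant, and geometry below are illustrative values, not the thesis's calibrated device parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_carriers, n_steps, dt = 10_000, 500, 1e-12   # carriers, time steps, step size [s]
mu, D = 0.14, 3.6e-3                           # electron mobility [m^2/Vs], diffusion [m^2/s]
E_field = 1e6                                  # vertical field [V/m], pointing toward the junction
collect_depth = 0.0                            # carriers drifting past z = 0 are collected

# Initial carrier positions along the ion track (depth in metres).
z = rng.normal(loc=0.5e-6, scale=0.1e-6, size=n_carriers)
collected = np.zeros(n_carriers, dtype=bool)

for _ in range(n_steps):
    drift = -mu * E_field * dt                                   # deterministic drift step
    diffusion = rng.normal(0.0, np.sqrt(2 * D * dt), n_carriers) # random-walk step
    z = np.where(collected, z, z + drift + diffusion)
    collected |= z <= collect_depth

print(f"fraction of deposited charge collected: {collected.mean():.2f}")
```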

    Functional Failure Rate Due to Single-Event Transients in Clock Distribution Networks

    No full text
    With technology scaling, lower supply voltages, and higher operating frequencies, clock distribution networks become more and more vulnerable to transient faults. These faults can cause circuit-wide effects and thus contribute significantly to the functional failure rate of the circuit. This paper proposes a methodology to analyse how the functional behaviour is affected by Single-Event Transients in the clock distribution network. The approach is based on logic-level simulation and thus only uses the register-transfer-level description of a design. A fault model is therefore proposed which implements the main effects of radiation-induced transients in the clock network. This fault model enables the computation of the functional failure rate caused by Single-Event Transients for each individual clock buffer, as well as for the complete network. Further, it allows the identification of the flip-flops most vulnerable to Single-Event Transients in the clock network. The proposed methodology is applied to a practical example and a fault injection campaign is performed. To evaluate the impact of Single-Event Transients in clock distribution networks, the obtained functional failure rate is compared to the error rate caused by Single-Event Upsets in the sequential logic
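    A minimal sketch of how a per-clock-buffer fault-injection campaign can be tallied into a functional failure rate, assuming each injection run reports whether the workload's outputs were corrupted; run_set_injection() is a hypothetical placeholder for a logic-level simulation run, not a real tool API, and the failure probabilities are made up.

```python
import random

random.seed(0)
clock_buffers = ["ckbuf_root", "ckbuf_l1_0", "ckbuf_l1_1", "ckbuf_leaf_7"]
runs_per_buffer = 1000

def run_set_injection(buffer_name: str) -> bool:
    """Placeholder: inject one Single-Event Transient on the given clock buffer
    during a random cycle of the workload and report a functional failure."""
    # Stand-in for the logic-level simulation; probabilities are illustrative only.
    return random.random() < {"ckbuf_root": 0.12}.get(buffer_name, 0.03)

ffr = {}
for buf in clock_buffers:
    failures = sum(run_set_injection(buf) for _ in range(runs_per_buffer))
    ffr[buf] = failures / runs_per_buffer  # per-buffer functional failure probability

for buf, rate in sorted(ffr.items(), key=lambda kv: -kv[1]):
    print(f"{buf:12s}  FFR = {rate:.3f}")
```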

    On the Estimation of Complex Circuits Functional Failure Rate by Machine Learning Techniques

    No full text
    De-Rating or Vulnerability Factors are a major feature of failure analysis efforts mandated by today's Functional Safety requirements. Determining the Functional De-Rating of sequential logic cells typically requires computationally intensive fault-injection simulation campaigns. In this paper a new approach is proposed which uses Machine Learning to estimate the Functional De-Rating of individual flip-flops and thus optimise and enhance fault injection efforts. First, a set of per-instance features is described and extracted through an analysis approach combining static elements (cell properties, circuit structure, synthesis attributes) and dynamic elements (signal activity). Second, reference data is obtained through first-principles fault simulation approaches. Finally, one part of the reference dataset is used to train the Machine Learning algorithm and the remainder is used to validate and benchmark the accuracy of the trained tool. The intended goal is to obtain a trained model able to provide accurate per-instance Functional De-Rating data for the full list of circuit instances, an objective that is difficult to reach using classical methods. The presented methodology is accompanied by a practical example to determine the performance of various Machine Learning models for different training sizes. Comment: arXiv admin note: text overlap with arXiv:2002.0888
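    A minimal sketch of comparing several regression models for different training-set sizes, as this abstract describes; the synthetic features stand in for the per-flip-flop static/dynamic attributes, and the models and training fractions are illustrative, not the paper's exact selection.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
X = rng.random((500, 5))                                              # per-flip-flop feature vectors
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2 + 0.05 * rng.normal(size=500)  # synthetic de-rating target

X_train_full, X_test, y_train_full, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

models = {
    "linear": LinearRegression(),
    "gbr":    GradientBoostingRegressor(random_state=1),
    "knn":    KNeighborsRegressor(n_neighbors=5),
}
# Grow the training set and benchmark each model on the same held-out instances.
for frac in (0.1, 0.3, 0.6, 1.0):
    n = max(5, int(frac * len(X_train_full)))
    for name, model in models.items():
        model.fit(X_train_full[:n], y_train_full[:n])
        mae = mean_absolute_error(y_test, model.predict(X_test))
        print(f"train size {n:3d}  {name:6s}  MAE = {mae:.3f}")
```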
